Are certain genres sadder than others? Can we see this with regression?
There does not appear to be enough multicollinearity among the predictors to violate the usual linear model assumptions.
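One common way to check this is with variance inflation factors. A minimal sketch, assuming the audio features and the valence outcome live in a data frame called `dat` (the column names here are assumptions):

```r
library(car)  # provides vif()

# Fit a main-effects-only model and inspect variance inflation factors.
# VIF values well below 5 suggest multicollinearity is not a concern.
fit_main <- lm(valence ~ danceability + energy + loudness + acousticness +
                 speechiness + tempo + instrumentalness + liveness + key,
               data = dat)
vif(fit_main)
```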
Call:
lm(formula = .outcome ~ ., data = dat)
Residuals:
Min 1Q Median 3Q Max
-0.57938 -0.12137 -0.01451 0.11287 0.61145
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.3179539 0.0899796 -3.534 0.000425 ***
danceability 0.4978225 0.0322873 15.419 < 2e-16 ***
energy 0.6104190 0.1056897 5.776 9.66e-09 ***
loudness -0.0106577 0.0082277 -1.295 0.195441
acousticness 0.2334146 0.1131236 2.063 0.039284 *
speechiness 0.1772236 0.0455258 3.893 0.000104 ***
tempo 0.0001279 0.0001757 0.728 0.466931
instrumentalness -0.0488339 0.0204058 -2.393 0.016851 *
liveness 0.0200725 0.0354814 0.566 0.571685
key 0.0013089 0.0013788 0.949 0.342639
`energy:loudness` 0.0174139 0.0122201 1.425 0.154398
`energy:acousticness` -0.0961812 0.1766591 -0.544 0.586232
`loudness:acousticness` 0.0080406 0.0096553 0.833 0.405137
`energy:loudness:acousticness` 0.0048689 0.0161302 0.302 0.762814
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1835 on 1254 degrees of freedom
Multiple R-squared: 0.3451, Adjusted R-squared: 0.3383
F-statistic: 50.83 on 13 and 1254 DF, p-value: < 2.2e-16
It looks like danceability, energy, acousticness, speechiness, and instrumentalness are significant predictors of the joyfulness/sadness of a song. Since the coefficient on instrumentalness is negative, the model suggests that a song with more instrumentalness is typically a sadder song. Unfortunately, these predictors explain only about 34% of the variance in valence.
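For reference, a model with the structure shown in the summary above (the printed call used caret's `.outcome ~ .` interface) could be specified directly like this; the data frame and column names are assumptions:

```r
# energy * loudness * acousticness expands to the main effects plus all
# two-way interactions and the three-way interaction seen in the summary.
fit <- lm(valence ~ danceability + speechiness + tempo +
            instrumentalness + liveness + key +
            energy * loudness * acousticness,
          data = dat)
summary(fit)  # coefficients, R-squared, and F-statistic as shown above
```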
The black line is a LOESS fit and the dark red line is a linear model, included to show the general trend.
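A plot like this can be built by layering two smoothers in ggplot2. A sketch, assuming `dat` holds the features and the x-axis variable here (`energy`) is illustrative:

```r
library(ggplot2)

# Overlay a LOESS smoother (black) and a straight-line fit (dark red)
# on the same scatterplot to compare local and global trends.
ggplot(dat, aes(x = energy, y = valence)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "loess", se = FALSE, colour = "black") +
  geom_smooth(method = "lm", se = FALSE, colour = "darkred")
```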